Decoding of Expander Codes 
at Rates Close to Capacity 

Alexei Ashikhmin and Vitaly Skachek 

Abstract: The decoding error probability of codes is studied as a
function of their block length. It is shown that the existence of codes with
a polynomially small decoding error probability implies the existence of
codes with an exponentially small decoding error probability. Specifically,
it is assumed that there exists a family of codes of length $N$ and
rate $R = (1-\varepsilon)C$ ($C$ is the capacity of a binary symmetric channel),
whose decoding error probability decreases polynomially in $1/N$. It is shown
that if the decoding error probability decreases sufficiently fast, but still
only polynomially fast in $1/N$, then there exists another such family of codes
whose decoding error probability decreases exponentially fast in $N$.
Moreover, if the decoding time complexity of the assumed family of
codes is polynomial in $N$ and $1/\varepsilon$, then the decoding time complexity
of the presented family is linear in $N$ and polynomial in $1/\varepsilon$. These
codes are compared to the recently presented codes of Barg and Zemor, "Error
Exponents of Expander Codes," IEEE Trans. Inform. Theory, 2002, and
"Concatenated Codes: Serial and Parallel," IEEE Trans. Inform. Theory,
2005. It is shown that the latter families cannot be tuned to have
exponentially decaying (in $N$) error probability and, at the same time, to
have decoding time complexity linear in $N$ and polynomial in $1/\varepsilon$.

Index Terms — Concatenated codes, decoding complexity, decoding 
error probability, error exponent, expander codes, IRA codes, iterative 
decoding, LDPC codes, linear-time decoding. 



I. Introduction 

A classical work of Shannon states that reliable communication
over a communication channel can be achieved at all information
rates below a certain threshold rate, the capacity, which
is a function of the channel characteristics. Codes and decoding
algorithms that attain the channel capacity have been studied extensively
over the last decades. For such codes with their respective decoding
algorithms, at rates less than the capacity, the probability of decoding
error approaches zero as the code length grows.

The rate of decrease of the decoding error probability as a function
of the code length, $N$, is a characteristic of capacity-approaching
codes that has been widely studied for many code families. However,
this probability also depends on the ratio between the channel capacity
and the actual code rate. Namely, let the code rate be $R = (1-\varepsilon)C$,
where $C$ is the channel capacity. An interesting question is how the
decoding error probability depends on $\varepsilon$.

Another characteristic of codes (more precisely, of their decoding
algorithms) is the time complexity of decoding. Families of
capacity-achieving codes (over various channels) are known whose decoding
time complexity is only linear in $N$. However, one may also consider
the decoding time complexity of code families in terms of $\varepsilon$.
In the next two paragraphs we discuss these characteristics for two
code families.

It is known that LDPC-type codes can attain the capacity of a binary
erasure channel (BEC); the reader can refer to [11], [13], [15]. It is
generally believed that LDPC-type codes can approach the capacity of a
variety of other communication channels. However, it is also believed
that the decoding error probability decreases only polynomially with
the code length. As to the decoding time complexity, it was conjectured
in [9] that the per-bit complexity of message-passing decoding
(e.g. [6], [16]) of LDPC or irregular repeat-accumulate (IRA) codes
over any 'typical' channel is
$O(\log\frac{1}{\pi}) + O(\frac{1}{\varepsilon}\log\frac{1}{\varepsilon})$, where $\pi$ is
the decoded error probability. Lately, for LDPC-type codes with
message-passing decoding over the BEC, the time complexity was shown to
be linear in the code length and sub-linear in $1/\varepsilon$. More specifically,
it was shown in [11] and [13] that the decoding complexity per bit
for some sub-families of LDPC-type codes behaves as $O(\log(1/\varepsilon))$.
Recently, in [14], IRA codes with bounded decoding complexity per
bit were constructed.

In contrast, modifications of expander codes presented in [1],
[2], [3], [17], [18] also attain the capacity of the memoryless
$q$-ary symmetric channel, and their error probability decreases
exponentially with the code length. Several recent works were
devoted to the analysis of the fraction of errors that expander codes can
correct (e.g. [4], [20], [21], [22]) and to their rate-distance trade-offs
(see [3], [8], [18]). While it is well known that there are decoders for
expander codes having linear-time (in the code length) complexity,
the dependence of this complexity on $1/\varepsilon$ was not studied. In the
present work, we study this dependence. We investigate the
time complexity of decoding algorithms of expander codes in terms
of $\varepsilon$, in particular for the codes in [1], [3]. We show that these
specific codes have time complexity that is exponential in $1/\varepsilon^2$.

In this work, we study capacity-achieving codes over a binary
symmetric channel (BSC). We show that if there exists a family of codes
$\mathcal{C}_{\mathrm{in}}$ of length $N$ and rate $R = (1-\varepsilon)C$ ($C$ is the BSC
capacity), with the decoding error probability vanishing inverse
polynomially in $N$ and $\varepsilon$ (under the conditions of our theorem), then
there exists another such family of codes $\mathcal{C}_{\mathrm{cont}}$ with the
decoding error probability vanishing exponentially in $N$. Moreover, if the
decoding time complexity of the codes $\mathcal{C}_{\mathrm{in}}$ is polynomial in $N$
and $1/\varepsilon$, then the decoding time complexity of the codes
$\mathcal{C}_{\mathrm{cont}}$ is linear in $N$ and polynomial in $1/\varepsilon$.

The structure of this paper is as follows. In Section II we describe
the basic ingredients of our construction. The main result of our paper
appears in Section III: we present a sufficient condition for the existence
of a family of codes with the decoding error probability vanishing
exponentially fast. We also analyze the decoding time complexity of
the presented codes. Finally, in Sections IV and V we show that
the codes in [1], [3] with their respective algorithms cannot be tuned
to have decoding error probability that decreases exponentially fast
(in terms of $N$) while the respective decoding algorithms have time
complexity linear in $N$ and polynomial in $1/\varepsilon$.

II. Preliminaries 
A. Capacity-achieving codes with fast decoding 

In this subsection we assume the existence of some (family of) linear
code $\mathcal{C}_{\mathrm{in}}$, which achieves the capacity $C$ of the BSC and has a
fast decoding algorithm. We denote its rate by $R_{\mathrm{in}} = (1-\varepsilon)C$ and
its length by $n_{\mathrm{in}}$ (constant for a fixed $\varepsilon$). Below, we discuss the
parameters of this code.

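Decoding complexity: we assume that the decoding complexity of
$\mathcal{C}_{\mathrm{in}}$ over the BSC is given by

$$ O\left( n_{\mathrm{in}}^{s} \cdot \frac{1}{\varepsilon^{r}} \right) , \tag{1} $$

where $s, r \ge 1$ are some constants. Let $\mathcal{D}_{\mathrm{in}}$ be a decoder that
has time complexity as in (1).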
Based on the results in [11], [13], [14], several LDPC-type code
families (with respective message-passing decoding algorithms) do
have such decoding complexity over the BEC (for $s = 1$). No such
results are known for the BSC, although in light of the
surveyed works this assumption seems reasonable for LDPC-type
codes over the BSC.

Decoding error probability: as of yet, there are no satisfactory results
on the asymptotic behavior of the decoding error probability of LDPC-type
codes over the binary erasure channel under message-passing
decoding, for rates near the capacity of the BEC. The behavior of the
decoding error probability of LDPC-type codes over other channels is
even less investigated. In this work, we obtain a sufficient condition
on the decoding error probability $\mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}})$ of the decoder
$\mathcal{D}_{\mathrm{in}}$ (for the code $\mathcal{C}_{\mathrm{in}}$) that guarantees the existence of a
code with an exponentially fast decreasing error probability.

Note: the results presented in the sequel are valid for any code
$\mathcal{C}_{\mathrm{in}}$ whose decoding time complexity and error probability are as
stated above. However, LDPC-type codes are very promising candidates to
meet these conditions, and in fact we do not see any other candidate
at the present moment. For this reason, it makes sense to speak about
LDPC-type codes in this context.

B. Nearly-MDS expander codes 

In this section, we consider linear-time decodable codes of rate
$1-\varepsilon$ (for small $\varepsilon > 0$) that can correct a fraction
$\vartheta\varepsilon^{b}$ of errors, where $\vartheta > 0$, $b > 0$ are constants.
Several code families known to date can be shown to have the above
property and at the same time allow linear-time (in the code length)
decoding. In this connection, the reader can refer to [1], [3], [8],
[20], [22]. However, as of yet, the codes in [17], [18] have the best
relations between their rate, distance and alphabet size among all known
expander-based linear-time decodable codes. Moreover, unlike the codes in
[17], [18], not all of the aforementioned codes have decoding time
complexity that is polynomial in $1/\varepsilon$.

Below, we recall the construction in [17], [18]. Let $\mathcal{G} = (A : B, E)$
be a bipartite $\Delta$-regular undirected connected graph with a vertex set
$V = A \cup B$ such that $A \cap B = \emptyset$ and $|A| = |B| = n$, and an edge
set $E$ of size $N = \Delta n$ such that every edge in $E$ has one endpoint
in $A$ and one endpoint in $B$. For every vertex $u \in V$, denote by
$E(u)$ the set of edges incident with $u$, and assume some ordering
on $E(u)$, for every $u \in V$. Let $F = \mathrm{GF}(q)$ be some finite field, with
$q \ge \Delta$.

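Take $\mathcal{C}_A$ and $\mathcal{C}_B$ to be Generalized Reed-Solomon (GRS) codes with
parameters $[\Delta, r_A\Delta, \delta_A\Delta]$ and $[\Delta, r_B\Delta, \delta_B\Delta]$
over $F$, respectively. (We use the notation $[n, k, d]$ for a linear code of
length $n$, dimension $k$, and minimum distance $d$.) We define the code
$\mathbb{C} = (\mathcal{G}, \mathcal{C}_A : \mathcal{C}_B)$ as in [18], namely

$$ \mathbb{C} = \left\{ c \in F^{N} : (c)_{E(u)} \in \mathcal{C}_A \text{ for every } u \in A
   \text{ and } (c)_{E(u)} \in \mathcal{C}_B \text{ for every } u \in B \right\} , \tag{2} $$

where $(x)_{E(u)}$ denotes the sub-word of $x = (x_e)_{e \in E} \in F^{N}$ that is
indexed by $E(u)$. The produced code $\mathbb{C}$ is a linear code of length $N$
over $F$.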

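Let $\Phi$ denote the alphabet $F^{r_A\Delta}$. Taking some linear one-to-one
mapping $\mathcal{E}_A : \Phi \to \mathcal{C}_A$ over $F$, and the mapping
$\psi : \mathbb{C} \to \Phi^{n}$ given by

$$ \psi(c) = \left( \mathcal{E}_A^{-1}\left( (c)_{E(u)} \right) \right)_{u \in A} , \quad c \in \mathbb{C} , $$

the authors of [18] define the code $\mathcal{C}_\Phi$ of length $n$ over $\Phi$ by

$$ \mathcal{C}_\Phi = \{ \psi(c) : c \in \mathbb{C} \} . $$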



Definition. An infinite sequence $\{a_i\}_{i=1}^{\infty}$, $a_i \to +\infty$,
$a_i \in \mathbb{R}$, is called a dense sequence of values if $a_1 < 100$ and
$a_{i+1} - a_i = o(a_i)$ (for $i \to \infty$). (The number 100 is a large
absolute constant; the condition $a_1 < 100$ ensures that not all elements
in the sequence are exponentially large.)

Let $\lambda_{\mathcal{G}}$ be the second largest eigenvalue of the adjacency matrix
of $\mathcal{G}$ and denote by $\gamma_{\mathcal{G}}$ the value $\lambda_{\mathcal{G}}/\Delta$.
When $\mathcal{G}$ is taken from a family of $\Delta$-regular bipartite Ramanujan
graphs (e.g. [10], [12]), we have

$$ \lambda_{\mathcal{G}} \le 2\sqrt{\Delta - 1} . \tag{3} $$

There are explicit constructions of such $\Delta$-regular Ramanujan graph
families for dense sequences of values $\Delta$ ([10], [12]).

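It was shown in [18] that the code $\mathcal{C}_\Phi$ has relative minimum
distance

$$ \delta_\Phi \ge \frac{\delta_B - \gamma_{\mathcal{G}}\sqrt{\delta_B/\delta_A}}{1 - \gamma_{\mathcal{G}}} . \tag{4} $$

It is also known that the rate of $\mathcal{C}_\Phi$ is

$$ R_\Phi \ge r_A + r_B - 1 . $$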

The linear-time decoding algorithm in Figure 1 was proposed
in [18]. It corrects any pattern of $\mu$ errors and $\rho$ erasures such that
$\mu + \rho/2 < \beta n$, where $\beta$ is given by

$$ \beta = \frac{(\delta_B/2) - \gamma_{\mathcal{G}}\sqrt{\delta_B/\delta_A}}{1 - \gamma_{\mathcal{G}}} . \tag{5} $$

The number of iterations $m$ in the algorithm was established in [18]
to be $m = O(\log n)$. The notation "?" is used for erasures, and
the notations $\mathcal{D}_A$ and $\mathcal{D}_B$ are used for decoders of the codes
$\mathcal{C}_A$ and $\mathcal{C}_B$, respectively.
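
The bounds (3), (4) and (5) are easy to evaluate numerically. The following is a minimal sketch, not part of [18]: the function names are our own, and $\gamma_{\mathcal{G}}$ is replaced by its Ramanujan upper bound $2\sqrt{\Delta-1}/\Delta$. Both (4) and (5) are decreasing in $\gamma_{\mathcal{G}}$ (because $\delta_A\delta_B < 4$), so the returned values are lower bounds.

```python
import math


def gamma_ramanujan(degree: int) -> float:
    """Upper bound on gamma_G = lambda_G / Delta implied by (3)."""
    return 2.0 * math.sqrt(degree - 1) / degree


def relative_distance_bound(delta_a: float, delta_b: float, gamma: float) -> float:
    """Right-hand side of (4): lower bound on the relative distance of C_Phi."""
    return (delta_b - gamma * math.sqrt(delta_b / delta_a)) / (1.0 - gamma)


def correctable_fraction(delta_a: float, delta_b: float, gamma: float) -> float:
    """Right-hand side of (5): the fraction beta handled by the decoder of Fig. 1."""
    return (delta_b / 2.0 - gamma * math.sqrt(delta_b / delta_a)) / (1.0 - gamma)


if __name__ == "__main__":
    eps = 0.1
    degree = 25601                      # any degree above (16 / eps)^2 = 25600
    gamma = gamma_ramanujan(degree)
    print(correctable_fraction(eps / 2, eps / 2, gamma), eps / 8)
```

With $\delta_A = \delta_B = \varepsilon/2$ and a degree slightly above $(16/\varepsilon)^2$, the printed value of $\beta$ exceeds $\varepsilon/8$.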



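Input: received word $y = (y_u)_{u \in A}$ in $(\Phi \cup \{?\})^{n}$.
For $u \in A$ do $(z)_{E(u)} \leftarrow \mathcal{E}_A(y_u)$ if $y_u \in \Phi$, and $(z)_{E(u)} \leftarrow ??\ldots?$ if $y_u = ?$.
For $i \leftarrow 1, 2, \ldots, m$ do {

    If $i$ is even then $X = A$, $\mathcal{D} = \mathcal{D}_A$,
    else $X = B$, $\mathcal{D} = \mathcal{D}_B$.

    For $u \in X$ do $(z)_{E(u)} \leftarrow \mathcal{D}((z)_{E(u)})$.

}

Output: $\psi(z)$ if $z \in \mathbb{C}$ (and declare 'error' otherwise).



Fig. 1. Decoder of Roth and Skachek for the code $\mathcal{C}_\Phi$.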

The proof in [18] requires that the decoder $\mathcal{D}_A$ is a mapping
$F^{\Delta} \to \mathcal{C}_A$ that correctly recovers any pattern of less than
$\delta_A\Delta/2$ errors over $F$, and that the decoder $\mathcal{D}_B$ is a mapping
$(F \cup \{?\})^{\Delta} \to \mathcal{C}_B$ that correctly recovers any pattern of
$\theta$ errors and $\nu$ erasures, provided that $2\theta + \nu < \delta_B\Delta$.
The decoders $\mathcal{D}_A$ and $\mathcal{D}_B$ are polynomial-time; for example,
the Berlekamp-Massey decoder can be used for both of them, and it can then
be implemented in $O(\Delta^2)$ time (or less).

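In the next proposition, we show that the parameters of the codes
in [18] of rate $1-\varepsilon$ can be tuned to correct a fraction
$\vartheta\varepsilon$ of errors for a constant $\vartheta > 0$.

Proposition 1: For any $\varepsilon \in (0, 1)$, and for a sequence of
alphabets $\{\Phi_i\}_{i=1}^{\infty}$ such that the sequence
$\{\log_2|\Phi_i|\}_{i=1}^{\infty}$ is dense, the codes $\mathcal{C}_\Phi$ (as above)
of rate $R_\Phi \ge 1-\varepsilon$ (with decoder $\mathcal{D}_\Phi$) can correct a
fraction $\vartheta\varepsilon$ of errors, where $\vartheta > 0$ is some constant.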

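Proof. There is a dense sequence of values $\Delta \in \{\Delta_i\}_{i=1}^{\infty}$
such that there exists a family of $\Delta$-regular bipartite Ramanujan graphs
$\mathcal{G}$ (see [10], [12]). For any such value $\Delta$, we can take both codes
$\mathcal{C}_A$ and $\mathcal{C}_B$ to be GRS codes of length $\Delta$ over an alphabet of
size $\Delta$, of rate $r_A = r_B = 1 - \varepsilon/2$ and relative minimum
distance $\delta_A = \delta_B = \varepsilon/2$. Consider a code $\mathcal{C}_\Phi$ defined
with respect to these $\mathcal{C}_A$ and $\mathcal{C}_B$. The rate $R_\Phi$ of
$\mathcal{C}_\Phi$ satisfies $R_\Phi \ge r_A + r_B - 1 = 1 - \varepsilon$. From (5), the
fraction of errors that the decoder $\mathcal{D}_\Phi$ can correct is given by

$$ \beta = \frac{\delta_B/2 - \gamma_{\mathcal{G}}\sqrt{\delta_B/\delta_A}}{1 - \gamma_{\mathcal{G}}}
   \ge \varepsilon/4 - \gamma_{\mathcal{G}}
   \ge \varepsilon/4 - 2\sqrt{\Delta - 1}/\Delta
   > \varepsilon/4 - 2/\sqrt{\Delta} . $$

Take any $\Delta$ such that $\Delta > (16/\varepsilon)^2$: for such $\Delta$,

$$ \beta > \vartheta\varepsilon , \quad \text{where } \vartheta = 1/8 . $$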

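Next, we observe that $|\Phi_i| = \Delta_i^{r_A\Delta_i}$. Based on the density
of $\{\Delta_i\}_{i=1}^{\infty}$, we show the density of the sequence
$\{\log_2|\Phi_i|\}_{i=1}^{\infty}$. Indeed,

$$ \lim_{i\to\infty} \frac{\log_2|\Phi_{i+1}| - \log_2|\Phi_i|}{\log_2|\Phi_i|}
   = \lim_{i\to\infty} \frac{\Delta_{i+1}\log_2\Delta_{i+1} - \Delta_i\log_2\Delta_i}{\Delta_i\log_2\Delta_i}
   = \lim_{i\to\infty} \left( \frac{\Delta_{i+1}\log_2\Delta_{i+1}}{\Delta_i\log_2\Delta_i} - 1 \right) $$

$$ = \lim_{i\to\infty} \left( \frac{\Delta_i + o(\Delta_i)}{\Delta_i} \cdot
   \frac{\log_2(\Delta_i + o(\Delta_i))}{\log_2\Delta_i} - 1 \right) = 1 - 1 = 0 . $$

Finally, from [10] and [12], $\Delta_1$ can be taken small enough, such that
$\log_2|\Phi_1| < 100$, as required. □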
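
The density argument can be illustrated numerically. The sketch below is only an illustration under our own assumptions: the degree sequences are hypothetical (they are not the degrees for which Ramanujan graphs are constructed in [10], [12]), and the helper names are ours. It evaluates the relative gap $(\log_2|\Phi_{i+1}| - \log_2|\Phi_i|)/\log_2|\Phi_i|$ with $\log_2|\Phi_i| = r_A\Delta_i\log_2\Delta_i$.

```python
import math


def log2_alphabet_size(degree: int, rate_a: float) -> float:
    """log2 |Phi| = r_A * Delta * log2(Delta), since |Phi| = Delta^(r_A * Delta)."""
    return rate_a * degree * math.log2(degree)


def relative_gap(degree: int, next_degree: int, rate_a: float) -> float:
    """Relative increase of log2 |Phi| between two consecutive degrees."""
    current = log2_alphabet_size(degree, rate_a)
    following = log2_alphabet_size(next_degree, rate_a)
    return (following - current) / current


if __name__ == "__main__":
    # Hypothetical dense degree sequence: consecutive integers.
    for degree in (10, 100, 1000):
        print(degree, relative_gap(degree, degree + 1, rate_a=0.95))
    # Hypothetical non-dense sequence: degrees that double at every step.
    for degree in (16, 256, 4096):
        print(degree, relative_gap(degree, 2 * degree, rate_a=0.95))
```

For consecutive degrees the gap shrinks towards zero, as in the proof. For doubling degrees it stays above 1, so such a sequence would not be dense.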



C. Concatenated codes 

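In this subsection, we revisit the definition of concatenated codes.
The following ingredients will be used:

• A linear $[n_{\mathrm{in}}, k_{\mathrm{in}} = R_{\mathrm{in}} n_{\mathrm{in}}]$ code $\mathcal{C}_{\mathrm{in}}$ over $F$ (inner code).

• A linear code $\mathcal{C}_\Phi$ of length $n$ and rate $R_\Phi$ over $\Phi = F^{k_{\mathrm{in}}}$ (outer
code).

• A linear one-to-one mapping $\mathcal{E}_0 : \Phi \to \mathcal{C}_{\mathrm{in}}$.

The respective concatenated code $\mathcal{C}_{\mathrm{cont}}$ is defined as

$$ \mathcal{C}_{\mathrm{cont}} = \left\{ (c_1|c_2|\cdots|c_n) \in F^{n \cdot n_{\mathrm{in}}} : c_i = \mathcal{E}_0(a_i)
   \text{ for } i \in 1, 2, \ldots, n, \text{ and } (a_1 a_2 \cdots a_n) \in \mathcal{C}_\Phi \right\} . $$

The rate of $\mathcal{C}_{\mathrm{cont}}$ is known to be $R_{\mathrm{cont}} = R_{\mathrm{in}} \cdot R_\Phi$.

Let $\mathcal{D}_{\mathrm{in}} : F^{n_{\mathrm{in}}} \to \mathcal{C}_{\mathrm{in}}$ and
$\mathcal{D}_\Phi : \Phi^{n} \to \mathcal{C}_\Phi$ be decoders for the codes
$\mathcal{C}_{\mathrm{in}}$ and $\mathcal{C}_\Phi$, respectively. A simple decoder
$\mathcal{D}_{\mathrm{cont}}$ for the code $\mathcal{C}_{\mathrm{cont}}$ is presented in Figure 2.
There exist more advanced decoders for the code $\mathcal{C}_{\mathrm{cont}}$ (e.g. GMD
decoding, [5]) that can correct more errors, but we consider the decoder
$\mathcal{D}_{\mathrm{cont}}$ due to its simplicity.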



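Input: received word $y = (y_1 y_2 \cdots y_{n \cdot n_{\mathrm{in}}})$ in $F^{n \cdot n_{\mathrm{in}}}$.

For $i \in 1, 2, \ldots, n$ do

    $u_i \leftarrow \mathcal{E}_0^{-1}\left( \mathcal{D}_{\mathrm{in}}\left( (y_{j+(i-1)\cdot n_{\mathrm{in}}})_{j=1}^{n_{\mathrm{in}}} \right) \right)$.

Let $(z_1 z_2 \cdots z_n) \leftarrow \mathcal{D}_\Phi((u_1 u_2 \cdots u_n))$.

Output: $(\mathcal{E}_0(z_1)|\mathcal{E}_0(z_2)|\cdots|\mathcal{E}_0(z_n))$.



Fig. 2. Decoder $\mathcal{D}_{\mathrm{cont}}$ for the code $\mathcal{C}_{\mathrm{cont}}$.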
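
The control flow of Fig. 2 can be sketched in a few lines. This is a toy illustration under our own assumptions, not the codes studied in this paper: the inner code is a binary 3-fold repetition code, the outer code is a repetition code over $\Phi = \{0, 1\}$ with majority decoding, and all function names are hypothetical.

```python
from typing import Callable, List, Sequence


def decode_concatenated(
    y: Sequence[int],
    n: int,
    n_in: int,
    inner_decode: Callable[[Sequence[int]], List[int]],
    enc0_inverse: Callable[[Sequence[int]], int],
    outer_decode: Callable[[Sequence[int]], List[int]],
    enc0: Callable[[int], List[int]],
) -> List[int]:
    """Two-stage decoder following Fig. 2: inner blocks first, then the outer code."""
    if len(y) != n * n_in:
        raise ValueError("received word must have length n * n_in")
    # Decode each inner block and map it back to an outer symbol u_i.
    u = [enc0_inverse(inner_decode(y[i * n_in:(i + 1) * n_in])) for i in range(n)]
    # Decode the word of outer symbols, then re-encode every symbol.
    z = outer_decode(u)
    out: List[int] = []
    for symbol in z:
        out.extend(enc0(symbol))
    return out


# Toy ingredients (assumptions of this sketch only).
def rep3_decode(block: Sequence[int]) -> List[int]:
    bit = 1 if sum(block) >= 2 else 0
    return [bit] * 3


def rep3_encode(symbol: int) -> List[int]:
    return [symbol] * 3


def rep3_encode_inverse(codeword: Sequence[int]) -> int:
    return codeword[0]


def outer_majority(u: Sequence[int]) -> List[int]:
    symbol = 1 if 2 * sum(u) > len(u) else 0
    return [symbol] * len(u)


if __name__ == "__main__":
    received = [1, 1, 0, 0, 0, 0, 1, 0, 1]
    print(decode_concatenated(received, 3, 3, rep3_decode,
                              rep3_encode_inverse, outer_majority, rep3_encode))
```

The decoder is passed as plain callables, so any inner and outer decoder with the same interface can be substituted.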



III. Main results 

A. General settings 

Consider a memoryless binary symmetric channel with crossover
probability $p$. Its capacity is given by $C = 1 - H_2(p)$, where
$H_2(x) = -x\log_2 x - (1-x)\log_2(1-x)$ is the binary entropy function. Let
$R = C(1-\varepsilon)$ be a design rate.
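
For concreteness, the capacity and the design rate can be computed as follows. This is a minimal sketch with function names of our own choosing.

```python
import math


def binary_entropy(x: float) -> float:
    """H_2(x) = -x log2 x - (1 - x) log2 (1 - x), with H_2(0) = H_2(1) = 0."""
    if x < 0.0 or x > 1.0:
        raise ValueError("x must lie in [0, 1]")
    if x == 0.0 or x == 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def bsc_capacity(p: float) -> float:
    """Capacity C = 1 - H_2(p) of the BSC with crossover probability p."""
    return 1.0 - binary_entropy(p)


def design_rate(p: float, eps: float) -> float:
    """Design rate R = C (1 - eps)."""
    return bsc_capacity(p) * (1.0 - eps)


if __name__ == "__main__":
    print(bsc_capacity(0.11), design_rate(0.11, 0.05))
```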

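Take $F$ to be $\mathrm{GF}(q)$, $q = 2^{\ell}$, $\ell \in \mathbb{N}$. Let
$\mathcal{C}_{\mathrm{in}}$ be a binary code of length $n_{\mathrm{in}}$ as assumed in
Section II-A. It can also be seen as an additive linear code of length
$\mathsf{n}_{\mathrm{in}} = n_{\mathrm{in}}/\ell$ over $F$. Let $\mathcal{C}_\Phi$ be a
linear code of length $n$ and rate $R_\Phi$ over the alphabet
$\Phi = F^{R_{\mathrm{in}}\mathsf{n}_{\mathrm{in}}}$. Pick some linear one-to-one mapping
$\mathcal{E}_0 : \Phi \to \mathcal{C}_{\mathrm{in}}$. Let $\mathcal{C}_{\mathrm{cont}}$
be the code corresponding to the concatenation of the code $\mathcal{C}_{\mathrm{in}}$
(as an inner code) with the code $\mathcal{C}_\Phi$ (as an outer code), as defined in
Section II-C. Suppose that $R_{\mathrm{cont}} \ge R$ is the rate of the (binary) code
$\mathcal{C}_{\mathrm{cont}}$ and $N_{\mathrm{cont}} = n \cdot n_{\mathrm{in}}$ is its length. Denote by
$\mathrm{Prob}_e(\mathcal{C}_{\mathrm{cont}})$ its error probability under the decoding by
$\mathcal{D}_{\mathrm{cont}}$.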

The following lemma is based on the result in [5, Chapter 4.2]. 
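Lemma 2: The error probability of the code $\mathcal{C}_{\mathrm{cont}}$ (as defined in
this section) under the decoding by $\mathcal{D}_{\mathrm{cont}}$, when the error
probability of the decoder $\mathcal{D}_{\mathrm{in}}$ for the code $\mathcal{C}_{\mathrm{in}}$ is
$\mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}})$ and the decoder $\mathcal{D}_\Phi$ corrects any
pattern of less than $\beta n$ errors, is bounded by

$$ \mathrm{Prob}_e(\mathcal{C}_{\mathrm{cont}}) \le \exp\{-n \cdot E\}
   = \exp\left\{ -N_{\mathrm{cont}} \cdot \frac{E}{n_{\mathrm{in}}} \right\} , $$

where $E$ is a constant given by

$$ E = -\beta\ln(\mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}})) - (1-\beta)\ln(1 - \mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}}))
   + \beta\ln(\beta) + (1-\beta)\ln(1-\beta) . \tag{6} $$

If the right-hand side of (6) is negative, we assume that $E$ is zero.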

The proof of this lemma appears in Appendix A. 
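
The exponent (6) can be evaluated directly. The sketch below uses our own function names. The expression in (6) coincides with the binary Kullback-Leibler divergence between $\beta$ and $\mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}})$, measured in nats. As an assumption of this sketch, the function returns 0 (the trivial bound) whenever $\mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}}) \ge \beta$, since the bound is only informative when the inner decoder fails with probability below $\beta$.

```python
import math


def error_exponent(beta: float, p_in: float) -> float:
    """Exponent E of (6) in nats; beta is the correctable fraction of the outer
    decoder and p_in the error probability of the inner decoder."""
    if not (0.0 < beta < 1.0 and 0.0 < p_in < 1.0):
        raise ValueError("beta and p_in must lie strictly between 0 and 1")
    if p_in >= beta:
        # Assumption of this sketch: fall back to the trivial exponent.
        return 0.0
    value = (-beta * math.log(p_in) - (1.0 - beta) * math.log1p(-p_in)
             + beta * math.log(beta) + (1.0 - beta) * math.log1p(-beta))
    return max(value, 0.0)


def concatenated_error_bound(n: int, beta: float, p_in: float) -> float:
    """Upper bound exp(-n * E) on the error probability of the concatenated code."""
    return math.exp(-n * error_exponent(beta, p_in))


if __name__ == "__main__":
    print(error_exponent(0.0125, 1e-4), concatenated_error_bound(1000, 0.0125, 1e-4))
```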

Remark. It is possible to improve the error exponent by a constant
factor by allowing the decoder for the code $\mathcal{C}_{\mathrm{in}}$ to output an
"erasure" message in the case of unreliable decoding of the code
$\mathcal{C}_{\mathrm{in}}$. See [5, Chapter 4.2] for details. We omit this analysis
for the sake of simplicity.

B. Sufficient condition 

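In this subsection, we derive a sufficient condition on the decoding
error probability of the code $\mathcal{C}_{\mathrm{in}}$ that provides a positive error
exponent for the code $\mathcal{C}_{\mathrm{cont}}$ as defined in subsection III-A. Below,
we use the notation $\mathcal{C}_{\mathrm{in}}[R_{\mathrm{in}}, n_{\mathrm{in}}]$ for the code
$\mathcal{C}_{\mathrm{in}}$ of rate $R_{\mathrm{in}}$ and length $n_{\mathrm{in}}$.

Theorem 3: Consider the BSC, and let $C$ be its capacity. Suppose
that the following two conditions hold:

(i) There exist constants $b > 0$, $\vartheta > 0$, $\varepsilon_1 \in (0, 1)$, such
that for any $\varepsilon$, $0 < \varepsilon < \varepsilon_1$, and for a sequence of alphabets
$\{\Phi_i\}_{i=1}^{\infty}$ where the sequence $\{\log_2|\Phi_i|\}_{i=1}^{\infty}$ is dense,
there exists a family of codes $\mathcal{C}_\Phi$ of rate $1-\varepsilon$ (with their
respective decoders) that can correct a fraction $\vartheta\varepsilon^{b}$ of errors.

(ii) There exist constants $\varepsilon_2 \in (0, 1)$ and $h_0 > 0$, such that for
any $\varepsilon$, $0 < \varepsilon < \varepsilon_2$, the decoding error probability of a
family of codes $\mathcal{C}_{\mathrm{in}}$ satisfies

$$ \mathrm{Prob}_e\left( \mathcal{C}_{\mathrm{in}}\left[ (1-\varepsilon)C, \frac{1}{\varepsilon^{h_0}} \right] \right) < \varepsilon^{b} . $$

Then, for any rate $R < C$, there exists a family of codes $\mathcal{C}_{\mathrm{cont}}$
as defined in subsection III-A (with respective decoder) that has an
exponentially decaying (in $N_{\mathrm{cont}}$) error probability.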
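Proof. Let $R = (1-\varepsilon)C$ be the design rate of the code
$\mathcal{C}_{\mathrm{cont}}$, and let $\varepsilon > 0$ be small (namely,
$\varepsilon < \min\{\varepsilon_1, \varepsilon_2\}$). Let $\kappa$ be a constant,
$0 < \kappa < 1$, which will be defined later, and let the rate of the code
$\mathcal{C}_{\mathrm{in}}$ be $R_{\mathrm{in}} = (1-\kappa\varepsilon)C$. We set the rate of
$\mathcal{C}_\Phi$ as

$$ R_\Phi = \frac{R}{R_{\mathrm{in}}} = \frac{1-\varepsilon}{1-\kappa\varepsilon}
   = 1 - (1-\kappa)\varepsilon - \Theta(\varepsilon^2) . $$

Then, by condition (i), the fraction $\beta$ of errors correctable by the
code $\mathcal{C}_\Phi$ satisfies $\beta \ge \vartheta((1-\kappa)\varepsilon)^{b}$.

For an alphabet $\Phi$, the length $n_{\mathrm{in}}$ of the code $\mathcal{C}_{\mathrm{in}}$ is
given by

$$ n_{\mathrm{in}} = \frac{\log_2|\Phi|}{R_{\mathrm{in}}} . $$

We select the smallest $\Phi \in \{\Phi_i\}_{i=1}^{\infty}$ such that

$$ \log_2|\Phi| \ge \frac{1}{(\kappa\varepsilon)^{h_0}} , $$

and, so,

$$ n_{\mathrm{in}} \ge \frac{1}{(\kappa\varepsilon)^{h_0}} . \tag{7} $$

Next, we use Lemma 2 to evaluate the decoding error probability
of the code $\mathcal{C}_{\mathrm{cont}}$. For small positive values of $\beta$ it holds that

$$ (1-\beta)\ln(1-\beta) \ge -\beta , $$

and thus, from Lemma 2 we obtain (by ignoring the positive term
$-(1-\beta)\ln(1-\mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}}))$ in (6)),

$$ \mathrm{Prob}_e(\mathcal{C}_{\mathrm{cont}})
   \le \exp\left\{ -n \cdot \left( -\beta\ln(\mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}})) + \beta\ln\beta - \beta \right) \right\}
   = \exp\left\{ -N_{\mathrm{cont}} \frac{\beta}{n_{\mathrm{in}}} \left( \ln\beta - \ln(\mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}})) - 1 \right) \right\} . $$

In order to have a positive error exponent, we require that

$$ \ln\beta - \ln(\mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}})) - 1 > 0 , $$

or, equivalently,

$$ \beta > e \cdot \mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}}) . \tag{8} $$

The decoding error probability of the selected code $\mathcal{C}_{\mathrm{in}}$ satisfies:

$$ \mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}}[(1-\kappa\varepsilon)C, n_{\mathrm{in}}])
   \le \mathrm{Prob}_e\left( \mathcal{C}_{\mathrm{in}}\left[ (1-\kappa\varepsilon)C, \frac{1}{(\kappa\varepsilon)^{h_0}} \right] \right)
   < (\kappa\varepsilon)^{b}
   < \frac{\vartheta((1-\kappa)\varepsilon)^{b}}{e} , \tag{9} $$

where the first inequality is due to (7), the second inequality follows
from condition (ii), and the third inequality can be satisfied by a
selection of a small constant $\kappa$ such that $\kappa^{b} < \vartheta(1-\kappa)^{b}/e$.

Since $\beta \ge \vartheta((1-\kappa)\varepsilon)^{b}$, the inequality (9) implies (8),
as required. □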
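
The choice of the constant $\kappa$ in the proof can be made explicit. The requirement $\kappa^{b} < \vartheta(1-\kappa)^{b}/e$ is equivalent to $\kappa/(1-\kappa) < (\vartheta/e)^{1/b}$, that is, $\kappa < c/(1+c)$ with $c = (\vartheta/e)^{1/b}$. The sketch below (function names are ours) computes this threshold and checks the condition.

```python
import math


def kappa_threshold(theta: float, b: float) -> float:
    """Supremum of the kappa in (0, 1) with kappa^b < theta * (1 - kappa)^b / e."""
    c = (theta / math.e) ** (1.0 / b)
    return c / (1.0 + c)


def satisfies_condition(kappa: float, theta: float, b: float) -> bool:
    """Check the requirement on kappa used for the third inequality in the proof."""
    return kappa ** b < theta * (1.0 - kappa) ** b / math.e


if __name__ == "__main__":
    theta, b = 1.0 / 8.0, 1.0            # the values obtained in Proposition 1
    limit = kappa_threshold(theta, b)
    print(limit, satisfies_condition(0.5 * limit, theta, b))
```

For $\vartheta = 1/8$ and $b = 1$ (the values of Proposition 1) any $\kappa$ below roughly 0.044 is admissible.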

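Example. Suppose that the decoding error probability of the code
$\mathcal{C}_{\mathrm{in}}$ of rate $R_{\mathrm{in}} = (1-\varepsilon)C$ and length $n_{\mathrm{in}}$
(for some decoder) is bounded by

$$ \mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}}) \le \frac{1}{n_{\mathrm{in}}} \cdot \frac{1}{\varepsilon^{4}} . $$

We choose $h_0 = b + 5$ (where $b$ is as in condition (i) of Theorem 3).
There obviously exists $\varepsilon_2$ such that for every $0 < \varepsilon < \varepsilon_2$,
for the code $\mathcal{C}_{\mathrm{in}}$ of length $n_{\mathrm{in}} = 1/\varepsilon^{h_0}$ and rate
$R_{\mathrm{in}} = (1-\varepsilon)C$,

$$ \mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}}) \le \frac{1}{n_{\mathrm{in}}} \cdot \frac{1}{\varepsilon^{4}}
   = \varepsilon^{h_0 - 4} = \varepsilon^{b+1} < \varepsilon^{b} . \tag{10} $$

From the expression (10) we see that condition (ii) of Theorem 3
is satisfied. This selection guarantees the existence of a positive error
exponent for the code $\mathcal{C}_{\mathrm{cont}}$.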

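Example. Suppose that the decoding error probability of the code
$\mathcal{C}_{\mathrm{in}}$ (of rate $R_{\mathrm{in}} = (1-\varepsilon)C$ and length $n_{\mathrm{in}}$) is
bounded by

$$ \mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}}) \le e^{-n_{\mathrm{in}}\varepsilon^{2}} . $$

We choose $h_0 = 3$. There obviously exists $\varepsilon_2$ such that for every
$0 < \varepsilon < \varepsilon_2$, for the code $\mathcal{C}_{\mathrm{in}}$ of length
$n_{\mathrm{in}} = 1/\varepsilon^{h_0}$ and rate $R_{\mathrm{in}} = (1-\varepsilon)C$, and for every
$b > 0$,

$$ \mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}}) \le e^{-n_{\mathrm{in}}\varepsilon^{2}}
   = e^{-(\varepsilon^{2}/\varepsilon^{3})} = e^{-(1/\varepsilon)} < \varepsilon^{b} , $$

and therefore Theorem 3 yields the existence of a positive error exponent
for the code $\mathcal{C}_{\mathrm{cont}}$.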

C. Example 

In this subsection, we consider a specific case of decoding error
probability for the code $\mathcal{C}_{\mathrm{in}}$. Theorem 3 can be directly applied
in this case. However, we conduct a direct minimization of the
decoding error probability of the code $\mathcal{C}_{\mathrm{cont}}$, which is obtained by
concatenation of the code $\mathcal{C}_\Phi$ in [18] with the assumed code
$\mathcal{C}_{\mathrm{in}}$, and obtain an analytical expression for the error exponent.
We show that the overall decoding error probability for this code
$\mathcal{C}_{\mathrm{cont}}$ has a positive error exponent.

Suppose that the decoding error probability for some inner code
$\mathcal{C}_{\mathrm{in}}$ over the binary symmetric channel with crossover probability
$p < H_2^{-1}(1 - R_{\mathrm{in}})$ and some polynomial-time decoder is given by:

$$ \mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}}) \le \frac{1}{n_{\mathrm{in}}^{t}} , $$

where $t$ is a constant, $t \ge 1$.

Below, we make a selection of parameters for the code $\mathcal{C}_{\mathrm{cont}}$.
This selection allows us to estimate the decoding error exponent as a
function of $\varepsilon$.

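Let $R = (1-\varepsilon)C$ be the design code rate. Pick the rate of
$\mathcal{C}_{\mathrm{in}}$ to be $R_{\mathrm{in}} = (1-\kappa\varepsilon)C$, where $\kappa \in (0, 1)$
is a constant. Then, we can write

$$ \frac{R}{R_{\mathrm{in}}} = \frac{C(1-\varepsilon)}{C(1-\kappa\varepsilon)}
   = 1 - (1-\kappa)\varepsilon - \Theta(\varepsilon^2) . $$

Next, we select the parameters of the code $\mathcal{C}_\Phi$ in [18], which serves
as an outer code. Take $\mathcal{C}_A$ and $\mathcal{C}_B$ as GRS codes over $F$, with
$|F| = \Delta$. We fix $\delta_B = 1 - R/R_{\mathrm{in}} - \delta_A = \eta(1 - R/R_{\mathrm{in}})$,
where $\eta \in (0, 1)$ (and thus $\delta_A = (1-\eta)(1 - R/R_{\mathrm{in}})$), and select
the degree $\Delta$ of the graph $\mathcal{G}$ as $\Delta = g/\varepsilon^2$, where $g$ is a
constant such that

$$ g > \frac{16}{\eta(1-\eta)(1-\kappa)^2} . $$

We have

$$ R_\Phi \ge r_A + r_B - 1 = 1 - \delta_A - \delta_B = R/R_{\mathrm{in}} . $$

By our selection (see (3)),

$$ \gamma_{\mathcal{G}} \le \frac{2}{\sqrt{\Delta}} = \frac{2\varepsilon}{\sqrt{g}} . $$

We obtain from (5),

$$ \beta \ge (\delta_B/2) - \gamma_{\mathcal{G}}\sqrt{\delta_B/\delta_A} \ge \vartheta\varepsilon + o(\varepsilon) , \tag{11} $$

where

$$ \vartheta = \frac{1}{2}\eta(1-\kappa) - 2\sqrt{\frac{\eta}{g(1-\eta)}} $$

is a constant which depends only on $\kappa$, $\eta$ and $g$.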

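The number of bits needed to represent each symbol of $\Phi$ is
$\log_2|\Phi| = r_A\Delta \cdot \log_2|F|$. Recall that $r_A = 1 - O(\varepsilon)$.
Therefore, the length $n_{\mathrm{in}}$ of the binary code $\mathcal{C}_{\mathrm{in}}$ is given by

$$ n_{\mathrm{in}} = \frac{r_A\Delta}{R_{\mathrm{in}}}\log_2(\Delta)
   = \frac{(1 - O(\varepsilon))\, g\log_2(g/\varepsilon^2)}{R_{\mathrm{in}}\varepsilon^2}
   = \frac{g\log_2(g/\varepsilon^2)}{R_{\mathrm{in}}\varepsilon^2}
   - O\left( \frac{g\log_2(g/\varepsilon^2)}{R_{\mathrm{in}}\varepsilon} \right) , \tag{12} $$

and thus, by ignoring the small term, the decoding error probability
of $\mathcal{C}_{\mathrm{in}}$ is

$$ \mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}}) \le \left( \frac{\varepsilon^2 R_{\mathrm{in}}}{g\log_2(g/\varepsilon^2)} \right)^{t} . \tag{13} $$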

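We substitute the expressions in (11) (only the main term) and (13)
into the result of Lemma 2 to obtain

$$ \mathrm{Prob}_e(\mathcal{C}_{\mathrm{cont}}) \le \exp\Bigg\{ -n \Bigg(
   -\vartheta\varepsilon \cdot t\ln\left( \frac{\varepsilon^2 R_{\mathrm{in}}}{g\log_2(g/\varepsilon^2)} \right)
   - (1 - \vartheta\varepsilon)\ln\left( 1 - \left( \frac{\varepsilon^2 R_{\mathrm{in}}}{g\log_2(g/\varepsilon^2)} \right)^{t} \right)
   + \vartheta\varepsilon\ln(\vartheta\varepsilon) + (1 - \vartheta\varepsilon)\ln(1 - \vartheta\varepsilon) \Bigg) \Bigg\} . \tag{14} $$

Note that for small $\varepsilon > 0$,

$$ \ln(1 - \vartheta\varepsilon) = -\vartheta\varepsilon + O(\varepsilon^2) , $$

and

$$ \ln\left( 1 - \left( \frac{\varepsilon^2 R_{\mathrm{in}}}{g\log_2(g/\varepsilon^2)} \right)^{t} \right) = o(\varepsilon) . $$

Hence, the equation (14) (when neglecting $o(\varepsilon)$ terms) becomes

$$ \mathrm{Prob}_e(\mathcal{C}_{\mathrm{cont}}) \le \exp\left\{ -n\vartheta\varepsilon \left(
   -t\ln\left( \frac{\varepsilon^2 R_{\mathrm{in}}}{g\log_2(g/\varepsilon^2)} \right) + \ln(\vartheta\varepsilon) - 1 \right) \right\}
   = \exp\left\{ -\frac{N_{\mathrm{cont}}\vartheta\varepsilon}{n_{\mathrm{in}}}
   \ln\left( \frac{\vartheta\varepsilon \cdot g^{t}(\log_2(g/\varepsilon^2))^{t}}{R_{\mathrm{in}}^{t} \cdot e \cdot \varepsilon^{2t}} \right) \right\} . $$

Using substitution of the expression (12) for $n_{\mathrm{in}}$, the latter equation
can be rewritten as

$$ \mathrm{Prob}_e(\mathcal{C}_{\mathrm{cont}}) \le \exp\left\{
   -\frac{N_{\mathrm{cont}}\vartheta\varepsilon \cdot \varepsilon^2 R_{\mathrm{in}}}{2g\left( \log_2(1/\varepsilon) + \Theta(1) \right)}
   \Big( (2t-1)\ln(1/\varepsilon) + t\ln(1/R_{\mathrm{in}}) + t\ln\ln(1/\varepsilon) + \Theta(1) \Big) \right\} . \tag{15} $$

The dominating term in the expression

$$ (2t-1)\ln(1/\varepsilon) + t\ln(1/R_{\mathrm{in}}) + t\ln\ln(1/\varepsilon) + \Theta(1) $$

is $(2t-1)\ln(1/\varepsilon)$. By taking into account that $R_{\mathrm{in}} = C(1 - O(\varepsilon))$,
the equation (15) can be rewritten, when ignoring all but the main
term, as

$$ \mathrm{Prob}_e(\mathcal{C}_{\mathrm{cont}}) \le \exp\left\{ -N_{\mathrm{cont}} \cdot \left(
   \frac{\vartheta(2t-1)C\varepsilon^{3}}{2g \cdot \log_2 e} + o(\varepsilon^{3}) \right) \right\} . $$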
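Thus, the decoding error probability is given by

$$ \mathrm{Prob}_e(\mathcal{C}_{\mathrm{cont}}) \le \exp\{ -N_{\mathrm{cont}} \cdot E(C, \varepsilon) \} , $$

where

$$ E(C, \varepsilon) = \max_{\kappa, \eta, g}\left\{ \frac{\vartheta}{g} \right\} \cdot \frac{(2t-1)C}{2 \cdot \log_2 e} \cdot \varepsilon^{3}
   = \max_{\kappa, \eta, g}\left\{ \frac{\eta(1-\kappa)}{2g} - 2\sqrt{\frac{\eta}{g^{3}(1-\eta)}} \right\}
   \cdot \frac{(2t-1)C}{2 \cdot \log_2 e} \cdot \varepsilon^{3} , \tag{16} $$

and the parameters $(\kappa, \eta, g)$ are taken over

$$ \kappa \in (0, 1) ; \quad \eta \in (0, 1) ; \quad g > \frac{16}{\eta(1-\eta)(1-\kappa)^2} . \tag{17} $$

Next, we optimize the value of the constant

$$ T = \max_{\kappa, \eta, g}\left\{ \frac{\eta(1-\kappa)}{2g} - 2\sqrt{\frac{\eta}{g^{3}(1-\eta)}} \right\} . $$

It is easy to see that the maximum is attained for $\kappa \to 0$. We
substitute $\kappa = 0$ in expression (16) to obtain

$$ T = \max_{\eta, g}\left\{ \frac{\eta}{2g} - 2\sqrt{\frac{\eta}{g^{3}(1-\eta)}} \right\} . \tag{18} $$

By taking the derivative of the expression in (18) with respect to $g$ and
equating it to zero, we obtain that

$$ g = \frac{36}{\eta(1-\eta)} . $$

By substituting it back into the expression (18) and finding its maximum,
we have $\eta = 2/3$ and $g = 162$. These values obviously satisfy
condition (17). The corresponding value of $T$ is then

$$ T = \frac{\eta}{2g} - 2\sqrt{\frac{\eta}{g^{3}(1-\eta)}}
   = \frac{2/3}{2 \cdot 162} - 2\sqrt{\frac{2/3}{162^{3} \cdot (1/3)}}
   = \frac{1}{1458} = 6.8587 \cdot 10^{-4} . $$

Finally, we have

$$ E(C, \varepsilon) = \frac{(2t-1)C}{2916 \cdot \log_2 e} \cdot \varepsilon^{3} . $$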
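
The optimization of $T$ can be checked numerically. The following sketch (function names are ours, and the grid search is only a sanity check, not a proof) evaluates the expression in (18), the stationary point $g = 36/(\eta(1-\eta))$, and the resulting main term of the error exponent.

```python
import math


def t_value(eta: float, g: float, kappa: float = 0.0) -> float:
    """The expression maximized in (16) and (18)."""
    return eta * (1.0 - kappa) / (2.0 * g) - 2.0 * math.sqrt(eta / (g ** 3 * (1.0 - eta)))


def best_g(eta: float) -> float:
    """Stationary point in g of the expression in (18) for a fixed eta."""
    return 36.0 / (eta * (1.0 - eta))


def error_exponent_main_term(capacity: float, eps: float, t: float) -> float:
    """Main term (2t - 1) C eps^3 / (2916 log2 e) of the error exponent."""
    return (2.0 * t - 1.0) * capacity * eps ** 3 / (2916.0 * math.log2(math.e))


if __name__ == "__main__":
    print(best_g(2.0 / 3.0), t_value(2.0 / 3.0, 162.0), 1.0 / 1458.0)
    # Coarse grid search over (eta, g) with kappa = 0, as a sanity check.
    grid_max = max(t_value(k / 100.0, float(g))
                   for k in range(1, 100) for g in range(10, 1000))
    print(grid_max)
    print([error_exponent_main_term(0.8, 0.1, t) for t in (1, 2, 3)])
```

The grid maximum stays below $1/1458$, which is consistent with the closed-form optimum at $\eta = 2/3$, $g = 162$.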

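Figure 3 shows the value of the error exponent $E(C, \varepsilon)$ in the example
for $t = 1$, 2 and 3.




Fig. 3. Error exponent $E(C, \varepsilon)$ for the code $\mathcal{C}_{\mathrm{cont}}$.
Selection: $\mathrm{Prob}_e(\mathcal{C}_{\mathrm{in}}) = 1/n_{\mathrm{in}}^{t}$; $C = 0.8$; $t = 1, 2, 3$ (bottom to top).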


D. Decoding complexity 

In this subsection, we show that under the assumption in
Section II-A on the decoding time complexity of the code $\mathcal{C}_{\mathrm{in}}$, and if
the parameters of the codes are selected as in the proof of Theorem 3,
then the decoding time complexity of the respective code $\mathcal{C}_{\mathrm{cont}}$ is
linear in the overall length $N_{\mathrm{cont}}$ and polynomial in $1/\varepsilon$, the
inverse of the gap from capacity.

Theorem 4: Consider the BSC, and let $C$ be its capacity. Let
$R = (1-\varepsilon)C$ be a design rate. Suppose that the following two
conditions hold:

(i) Let $\mathcal{C}_\Phi$ be a (family of) code defined in Section II-B of rate
$R_\Phi = (1-\varepsilon)/(1-\kappa\varepsilon)$, where $\kappa \in (0, 1)$ is a constant, over
the smallest alphabet $\Phi$ satisfying $\log_2|\Phi| \ge 1/(\kappa\varepsilon)^{h_0}$ from a
dense sequence $\{\log_2|\Phi_i|\}_{i=1}^{\infty}$, where $h_0 > 0$ is a constant.

(ii) Let $\mathcal{C}_{\mathrm{in}}$ be a code of rate $R_{\mathrm{in}} = (1-\kappa\varepsilon)C$ with the
decoding complexity over the BSC of capacity $C$ given by

$$ O\left( n_{\mathrm{in}}^{s} \cdot \frac{1}{\varepsilon^{r}} \right) , $$

where $s, r \ge 1$ are some constants.

Then, the time complexity of decoding the respective code $\mathcal{C}_{\mathrm{cont}}$
by $\mathcal{D}_{\mathrm{cont}}$ is given by

$$ N_{\mathrm{cont}} \cdot \mathrm{Poly}(1/\varepsilon) . $$

Proof. Below we count the total number of operations when
decoding the code $\mathcal{C}_{\mathrm{cont}}$ by the decoder $\mathcal{D}_{\mathrm{cont}}$. There are
two main steps.

• Step 1: $n$ applications of the decoder $\mathcal{D}_{\mathrm{in}}$ on a binary word
of length $n_{\mathrm{in}}$.

• Step 2: one application of the decoder $\mathcal{D}_\Phi$ on a word of length
$n$ over $\Phi$.

In addition, there are $n$ applications of each of the mappings $\mathcal{E}_0$ and
$\mathcal{E}_0^{-1}$.

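We separately count the number of operations during each step.

• Step 1: By the assumption on the decoding complexity of $\mathcal{D}_{\mathrm{in}}$,
$n$ applications of this decoder result in time

$$ O\left( n \cdot \frac{n_{\mathrm{in}}^{s}}{\varepsilon^{r}} \right)
   = O\left( N_{\mathrm{cont}} \cdot \frac{n_{\mathrm{in}}^{s-1}}{\varepsilon^{r}} \right) . \tag{19} $$

From the definition of $\Phi$, $n_{\mathrm{in}} = \log_2|\Phi| / R_{\mathrm{in}}$, so we have

$$ n_{\mathrm{in}} = \frac{\log_2|\Phi|}{(1-\kappa\varepsilon)C} . $$

By using the density of the values of $\log_2|\Phi|$, we have
$\log_2|\Phi| \in \mathrm{Poly}(1/\varepsilon)$, thus yielding
$n_{\mathrm{in}} \in \mathrm{Poly}(1/\varepsilon)$. By substitution into (19), we obtain that
the time complexity of Step 1 is $N_{\mathrm{cont}} \cdot \mathrm{Poly}(1/\varepsilon)$.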

• Step 2: it is shown in [18] that the number of applications of the
decoders $\mathcal{D}_A$ and $\mathcal{D}_B$ on a word of $\mathcal{C}_\Phi$ of length $n$ over
$\Phi$ is bounded by $\omega \cdot n$, where



hi 



In 



( A&M 

( 5a8b 



1 + - 



1 - 



V SaSb 



and $\sigma$ is the actual number of errors in the word. Thus, if the ratio
$\sigma/\beta$ is bounded away from 1, and $\mathcal{G}$ is a Ramanujan graph, then
the value of $\omega$ is bounded from above by an absolute constant
(independent of $\Delta$).

The decoders $\mathcal{D}_A$ and $\mathcal{D}_B$ are applied on words of length
$\Delta \in \mathrm{Poly}(1/\varepsilon)$. When half-minimum-distance decoders for
GRS codes are used, their complexity is polynomial in $1/\varepsilon$.
Therefore, the decoding complexity in Step 2 is bounded by

$$ n \cdot \mathrm{Poly}(1/\varepsilon) \le N_{\mathrm{cont}} \cdot \mathrm{Poly}(1/\varepsilon) . $$

Each application of the mapping $\mathcal{E}_0$ or $\mathcal{E}_0^{-1}$ is equivalent to
multiplication of a vector by a matrix, where the number of rows and columns
in the matrix is $\mathrm{Poly}(1/\varepsilon)$. This can be done in time
$\mathrm{Poly}(1/\varepsilon)$.

Summing up the decoding complexities of all steps of the decoder,
we obtain that the total number of operations is bounded by

$$ N_{\mathrm{cont}} \cdot \mathrm{Poly}(1/\varepsilon) . $$ □



Note. The result in Theorem 4 remains valid if the outer code $\mathcal{C}_\Phi$ is
replaced by any other code of rate $1 - \Theta(\varepsilon)$ whose decoding time
complexity is linear in $n$ and polynomial in $1/\varepsilon$, for a log-dense
sequence of alphabet sizes.

IV. Time complexity of decoder in [1]

Similarly to Section III, assume in this and the next sections that $C$
is the capacity of the BSC with crossover probability $p$, and the design
code rate is $R = (1-\varepsilon)C$. Our purpose is to compare the parameters
of the codes from Section III with the codes presented by Barg and
Zemor in [1] and [3] (with their respective decoding algorithms). In
the sequel we show that the parameters of the codes from [1] and [3]
cannot be modified such that the decoding time complexity would be
only sub-exponential in $1/\varepsilon$ while keeping a non-zero error exponent.
The reason is this: both decoding algorithms in [1] and [3] make use
of sub-routines (decoders for small constituent codes) that have time
complexity exponential in the degree of the underlying expander graph.
This degree, in turn, depends (at least) polynomially on $1/\varepsilon$.

A. Construction 

We briefly recall the construction and the decoder in [1]. Let
$\mathcal{G} = (A : B, E)$ be a bipartite $\Delta$-regular undirected connected graph
with a vertex set $V = A \cup B$ such that $A \cap B = \emptyset$ and
$|A| = |B| = n$, and an edge set $E$ of size $N = \Delta n$ such that every edge
in $E$ has one endpoint in $A$ and one endpoint in $B$.

Let the size of the finite field $F$ be a power of 2. Let $\mathcal{C}_A$ and
$\mathcal{C}_B$ be two random codes of length $\Delta$ over $F$. The code
$\mathcal{C}_{BZ2} = (\mathcal{G}, \mathcal{C}_A : \mathcal{C}_B)$ is defined similarly to the
definition of $\mathbb{C}$ in (2), with respect to $\mathcal{C}_A$ and $\mathcal{C}_B$ as
defined in this paragraph.

B. Decoding 

Let us submit a word $c = (c_e)_{e \in E} \in \mathcal{C}_{BZ2}$ to the BSC. Assume
that $y = (y_e)_{e \in E}$ is the received (erroneous) word. A formal definition
of the decoder $\mathcal{D}_{BZ2}$ appears in Figure 4. The number of iterations $m$
is taken to be $O(\log n)$. The decoders $\mathcal{D}_A$ and $\mathcal{D}_B$ are the
maximum-likelihood decoders for the codes $\mathcal{C}_A$ and $\mathcal{C}_B$,
respectively.

Input: received word $y = (y_e)_{e \in E}$ in $F^{N}$.

Let $z \leftarrow y$.

For $i \leftarrow 1, 2, \ldots, m$ do {

    If $i$ is odd then $X = A$, $\mathcal{D} = \mathcal{D}_A$,
    else $X = B$, $\mathcal{D} = \mathcal{D}_B$.

    For $u \in X$ do $(z)_{E(u)} \leftarrow \mathcal{D}((z)_{E(u)})$.

}

Output: $z$ if $z \in \mathcal{C}_{BZ2}$ (and declare 'error' otherwise).



Fig. 4. Decoder $\mathcal{D}_{BZ2}$ of Barg and Zemor for the code $\mathcal{C}_{BZ2}$.

The analysis of codes in [1] is divided into two cases. In the first 
case, the codes Ca and Cb over F = GF(2) are considered. In the 
second case, the analysis is generalized toward field sizes, which are 
large powers of 2. We analyze these two cases separately. 

C. Analysis: binary codes 

In the binary case, following the analysis of [1], it is possible to
show that for the code $\mathcal{C}_{BZ2}$ with the decoder $\mathcal{D}_{BZ2}$, the
decoding error probability, $\mathrm{Prob}_e(\mathcal{C}_{BZ2}, p)$, is bounded by

$$ \mathrm{Prob}_e(\mathcal{C}_{BZ2}, p) \le \exp\{ -\alpha N f_3(R, p) \} , $$

where $0 < \alpha < 1$, and the main term of $f_3(R, p)$ is less than or equal to

and $E_0(R_0, p)$ is the random coding exponent for rate $R_0$ over the
BSC with crossover probability $p$.

Proposition 5: If the codes $\mathcal{C}_{BZ2}$ (binary, as assumed in this
subsection) have a positive error exponent under the decoding by
$\mathcal{D}_{BZ2}$, then $\Delta = \Omega(1/(H_2^{-1}(\varepsilon))^2)$.

Proof. In order to have a positive error exponent it is needed that 

H^-^)_ e a)>o, 

Observe that $R_0 - R < C - R = C\varepsilon < \varepsilon$. It follows from (20)
that

$$ H_2^{-1}(\varepsilon) \ge H_2^{-1}(R_0 - R) \ge \Theta(1/\sqrt{\Delta}) , $$

and thus $\Delta = \Omega(1/(H_2^{-1}(\varepsilon))^2)$. □

It is suggested in [1] to use the maximum-likelihood decoding
for random codes $\mathcal{C}_A$ and $\mathcal{C}_B$. This decoding, however, has time
complexity at least

\[ \exp\{\Omega(\Delta)\} = \exp\big\{\Omega\big(1/(H_2^{-1}(\varepsilon))^2\big)\big\} . \]

D. Analysis: codes over large fields 

Suppose that the size of the field $F$ is a large power of 2. In this
case, for the code $\mathcal{C}_{BZ2}$ under the decoding by $\mathcal{D}_{BZ2}$, the decoding
error probability $\mathrm{Prob}_e(\mathcal{C}_{BZ2}, p)$ is bounded by

\[ \mathrm{Prob}_e(\mathcal{C}_{BZ2}, p) < \exp\{-\alpha N f_2(R,p)\} , \]

and the main term of $f_2(R,p)$ is less than or equal to

In this case, Proposition 5 can be rewritten as

Proposition 6: If the codes $\mathcal{C}_{BZ2}$ (over large $F$, as assumed in
this subsection) have a positive error exponent under the decoding
by $\mathcal{D}_{BZ2}$, then $\Delta = \Omega(1/\varepsilon^2)$.

The proof is very similar to that of Proposition 5.

When using the maximum-likelihood decoder for random codes
$\mathcal{C}_A$ and $\mathcal{C}_B$, the decoding time complexity is at least

\[ \exp\{\Omega(\Delta)\} = \exp\{\Omega(1/\varepsilon^2)\} . \]
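
To get a feel for how the two lower bounds on $\Delta$ compare, the scaling functions $1/(H_2^{-1}(\varepsilon))^2$ (Proposition 5) and $1/\varepsilon^2$ (Proposition 6) can be evaluated numerically. The following Python sketch is only illustrative: the constants hidden by the $\Omega(\cdot)$ notation are unknown and are set to 1 here, and the helper names (`h2`, `h2_inv`, `binary_scale`, `large_field_scale`) are hypothetical.

```python
import math

def h2(x):
    """Binary entropy function H_2(x) in bits."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def h2_inv(y):
    """Inverse of H_2 restricted to [0, 1/2], computed by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = (lo + hi) / 2
        if h2(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def binary_scale(eps):
    """Scaling 1 / (H_2^{-1}(eps))^2 of Proposition 5 (hidden constant taken as 1)."""
    return 1.0 / h2_inv(eps) ** 2

def large_field_scale(eps):
    """Scaling 1 / eps^2 of Proposition 6 (hidden constant taken as 1)."""
    return 1.0 / eps ** 2

for eps in (0.1, 0.01, 0.001):
    print(eps, binary_scale(eps), large_field_scale(eps))
```

Since $H_2(x) > x$ on $(0, 1/2)$, one has $H_2^{-1}(\varepsilon) < \varepsilon$, so the binary-case scaling is always the larger of the two; the loop simply makes the gap visible for a few values of $\varepsilon$.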

V. Time complexity of decoder in [3] 
A. Construction 

Recall the construction of expander codes presented in [3]. Let
$\mathcal{G} = (V,E)$ be a bipartite graph with $V = V_0 \cup (V_1 \cup V_2)$, such
that each edge has one endpoint in $V_0$ and one endpoint in either $V_1$
or $V_2$. Let $|V_i| = n$ for $i = 0, 1, 2$. Let the degree of each vertex
in $V_0$, $V_1$, and $V_2$ be $\Delta$, $\Delta_1$, and $\Delta_2 = \Delta - \Delta_1$, respectively. In
addition, let the subgraph $\mathcal{G}_1$ induced by $V_0 \cup V_1$ be a regular bipartite
Ramanujan graph and denote by $E_1$ its edge set. Let $\lambda_1$ be the second
largest eigenvalue of the adjacency matrix of $\mathcal{G}_1$.

Let $\mathcal{C}_A$ be a $[\ell\Delta, R_0\ell\Delta, d_0 = \ell\Delta\delta_0]$ linear binary code of rate
$R_0 = \Delta_1/\Delta$. Let $\mathcal{C}_B$ be a $q$-ary $[\Delta_1, R_1\Delta_1, d_1 = \Delta_1\delta_1]$ additive
code, and let $q = 2^{\ell}$. Let $\mathcal{C}_{aux}$ be a $q$-ary code of length $\Delta_1$. The
code $\mathcal{C}_{BZ3}$ is defined as the set of vectors $x = \{x_1, x_2, \ldots, x_N\}$,
indexed by the set $E$ of size $N = \Delta n$, such that

1) For every vertex $v \in V_0$, the subvector $(x_j)_{j \in E(v)}$ is a $q$-ary
codeword of $\mathcal{C}_A$ and the set of coordinates $E_1(v)$ is an
information set for the code $\mathcal{C}_A$.

2) For every vertex $v \in V_1$, the subvector $(x_j)_{j \in E(v)}$ is a $q$-ary
codeword of $\mathcal{C}_B$.

3) For every vertex $v \in V_0$, the subvector $(x_j)_{j \in E_1(v)}$ is a
codeword of $\mathcal{C}_{aux}$.

B. Decoding 

The authors of [3] proposed a decoding algorithm for the code $\mathcal{C}_{BZ3}$.
In the first iteration, each subvector $z(v)$, $v \in V_0$, is treated as
follows: the decoder computes, for every symbol $b$ of the $q$-ary
alphabet, and for every edge $e \in E_1$ incident to $v$, the weight of the
edge as

\[ d_{e,b}(z) = \min_{a \in \mathcal{C}_A \,:\, a_e = b} d(a, z(v)) , \]

where $a_e$ denotes the $q$-ary coordinate of the codeword $a$ that
corresponds to the edge $e$, and $d(\cdot,\cdot)$ is the binary Hamming distance.
This information is passed along the edge $e$ to the corresponding
decoder on the right-hand side of the bipartite graph. In the second
iteration, for every vertex $w \in V_1$ the right decoder associated to it
finds a $q$-ary codeword $b = (b_1, \ldots, b_{\Delta_1}) \in \mathcal{C}_B$ that satisfies

\[ b = \arg\min_{b \in \mathcal{C}_B} \sum_{i=1}^{\Delta_1} d_{w(i), b_i}(z) , \]

and writes $b_i$ on the edge $w(i)$, $i = 1, \ldots, \Delta_1$.

Then, the decoder continues similarly to the decoder in [1].
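
The first two iterations described above can be sketched in Python as follows. This is a brute-force illustration under stated assumptions, not the implementation of [3]: the function names (`symbols`, `edge_weights`, `right_decode`), the representation of a $q$-ary symbol as a tuple of $\ell$ bits, and the toy codebooks are all hypothetical, and the toy inputs are not tied to a particular graph.

```python
import math

def hamming(a, b):
    """Binary Hamming distance between two equal-length bit tuples."""
    return sum(x != y for x, y in zip(a, b))

def symbols(word, ell):
    """Split a bit tuple into consecutive ell-bit symbols, one per edge."""
    return [tuple(word[i:i + ell]) for i in range(0, len(word), ell)]

def edge_weights(code_a, z_v, ell, edges):
    """For each edge e and symbol b: min over a in code_a with a_e = b of d(a, z_v).

    Symbols b that no codeword takes on edge e are simply absent
    (treated as infinite weight by right_decode).
    """
    weights = {e: {} for e in edges}
    for a in code_a:
        dist = hamming(a, z_v)
        sym = symbols(a, ell)
        for e in edges:
            b = sym[e]
            weights[e][b] = min(weights[e].get(b, math.inf), dist)
    return weights

def right_decode(code_b, incoming):
    """Codeword b of code_b minimizing sum_i incoming[i][b_i]."""
    return min(
        code_b,
        key=lambda b: sum(incoming[i].get(s, math.inf) for i, s in enumerate(b)),
    )

# Toy left code (hypothetical): ell = 2, two edges per vertex, the symbol on
# edge 0 is repeated on edge 1, so edge 0 is an information set.
code_a = [(0, 0, 0, 0), (0, 1, 0, 1), (1, 0, 1, 0), (1, 1, 1, 1)]
w0 = edge_weights(code_a, (0, 1, 1, 1), ell=2, edges=[0])

# Toy right code (hypothetical): 4-ary repetition code of length 2.
code_b = [(s, s) for s in [(0, 0), (0, 1), (1, 0), (1, 1)]]
incoming = [w0[0], {(0, 0): 2, (0, 1): 0, (1, 0): 2, (1, 1): 2}]
best = right_decode(code_b, incoming)
```

Both functions enumerate an entire local codebook, so their running time grows with the number of codewords of the local codes; this is the cost that the analysis in the next subsection relates to $\Delta$.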

C. Analysis 

Lemma 7: Let $p$ satisfy $0 < p < \frac{1}{2}$, and let $0 < \varepsilon \ll p$. Then,

\[ H_2^{-1}\big(H_2(p) + \varepsilon(1-H_2(p))\big) = p + \frac{\varepsilon(1-H_2(p))}{\log_2((1-p)/p)} - \frac{\varepsilon^2(1-H_2(p))^2 \log_2 e}{2p(p-1)(\log_2((1-p)/p))^3} + O(\varepsilon^3) . \]

The proof of this lemma appears in Appendix B.
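
The expansion in Lemma 7 can be sanity-checked numerically. The Python sketch below inverts the binary entropy function by bisection and compares the result with the right-hand side of the lemma without the $O(\varepsilon^3)$ term. The helper names (`h2`, `h2_inv`, `lemma7_lhs`, `lemma7_rhs`) and the test point $p = 0.1$ are assumptions made for this illustration.

```python
import math

def h2(x):
    """Binary entropy function H_2(x) in bits."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def h2_inv(y):
    """Inverse of H_2 restricted to [0, 1/2], computed by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = (lo + hi) / 2
        if h2(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def lemma7_lhs(p, eps):
    """Exact value H_2^{-1}(H_2(p) + eps * (1 - H_2(p)))."""
    return h2_inv(h2(p) + eps * (1 - h2(p)))

def lemma7_rhs(p, eps):
    """Second-order expansion of Lemma 7 (the O(eps^3) term is dropped)."""
    theta = eps * (1 - h2(p))
    log_ratio = math.log2((1 - p) / p)
    first = theta / log_ratio
    second = theta ** 2 * math.log2(math.e) / (2 * p * (p - 1) * log_ratio ** 3)
    return p + first - second

p = 0.1
for eps in (0.02, 0.01, 0.005):
    print(eps, abs(lemma7_lhs(p, eps) - lemma7_rhs(p, eps)))
```

If the expansion is correct, halving $\varepsilon$ should shrink the printed discrepancy by roughly a factor of 8, in line with the $O(\varepsilon^3)$ remainder.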

Proposition 8: Let $C$ be the capacity of the BSC. The decoding
error probability of a random code of rate $R = (1 - \varepsilon)C$, under
the maximum-likelihood decoding, behaves as $\exp\{-\Theta(\varepsilon^2)\}$ when
$\varepsilon \to 0$.

Proof. We start with the well-known expression for the probability
exponent of the decoding error of a random code under the maximum-likelihood
decoding [6], [7]:

\[ E_0(R,p) = \begin{cases}
T(\delta,p) + R - 1 & \text{if } R_{crit} \le R < C \\
1 - \log_2\big(1 + \sqrt{4p(1-p)}\big) - R & \text{if } R_{min} \le R \le R_{crit} \\
-\delta \log_2 \sqrt{4p(1-p)} & \text{if } 0 < R \le R_{min} ,
\end{cases} \]

where $R_{min}$ and $R_{crit}$ are some threshold rates,

\[ \delta = \delta_{GV}(R) = H_2^{-1}(1-R) , \]

and

\[ T(x,y) = -x \log_2 y - (1-x) \log_2 (1-y) . \]

At the code rates $R$ which are close to $C$, the relevant expression for
the random coding exponent becomes

\[ E_0(R,p) = T(\delta,p) + R - 1 . \qquad (21) \]
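
The quadratic behaviour claimed in Proposition 8 can be checked numerically from (21). The Python sketch below evaluates $T(\delta,p) + R - 1$ at $R = (1-\varepsilon)(1 - H_2(p))$ and compares it with $c_p \varepsilon^2$, where $c_p = (1-H_2(p))^2 \log_2 e / \big(2p(1-p)(\log_2((1-p)/p))^2\big)$ is the constant that a second-order expansion of (21) produces. The function names and the test point $p = 0.1$ are assumptions made for this illustration.

```python
import math

def h2(x):
    """Binary entropy function H_2(x) in bits."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def h2_inv(y):
    """Inverse of H_2 restricted to [0, 1/2], computed by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = (lo + hi) / 2
        if h2(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def T(x, y):
    """T(x, y) = -x log2(y) - (1 - x) log2(1 - y)."""
    return -x * math.log2(y) - (1 - x) * math.log2(1 - y)

def exponent_near_capacity(p, eps):
    """Equation (21) evaluated at R = (1 - eps) * C, with C = 1 - H_2(p)."""
    rate = (1 - eps) * (1 - h2(p))
    delta = h2_inv(1 - rate)  # Gilbert-Varshamov relative distance
    return T(delta, p) + rate - 1

def c_p(p):
    """Leading constant of the eps^2 term (depends only on p)."""
    log_ratio = math.log2((1 - p) / p)
    return (1 - h2(p)) ** 2 * math.log2(math.e) / (2 * p * (1 - p) * log_ratio ** 2)

p = 0.1
for eps in (0.01, 0.001):
    print(eps, exponent_near_capacity(p, eps) / (c_p(p) * eps ** 2))
```

The printed ratios should approach 1 as $\varepsilon$ decreases, which is the $\Theta(\varepsilon^2)$ behaviour of the exponent near capacity.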

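Next, we express all terms of the relevant part of (21) in terms of
$\varepsilon$. We recall that $R = (1 - \varepsilon)(1 - H_2(p))$ and, thus,

\[ H_2^{-1}(1-R) = H_2^{-1}\big(\varepsilon + H_2(p) - \varepsilon H_2(p)\big) . \]

Thus, when disregarding the $O(\varepsilon^3)$ term, the equation (21) becomes

\[
\begin{aligned}
E_0(R,p) &= (1-\varepsilon)(1-H_2(p)) - 1 + T\big(H_2^{-1}(\varepsilon + H_2(p) - \varepsilon H_2(p)),\, p\big) \\
&\overset{(*)}{=} -\varepsilon - (1-\varepsilon)H_2(p) + T\Big(p + \frac{\varepsilon(1-H_2(p))}{\log_2((1-p)/p)} - \frac{\varepsilon^2(1-H_2(p))^2 \log_2 e}{2p(p-1)(\log_2((1-p)/p))^3},\, p\Big) \\
&= -\varepsilon - (1-\varepsilon)H_2(p) - \Big(p + \frac{\varepsilon(1-H_2(p))}{\log_2((1-p)/p)} - \frac{\varepsilon^2(1-H_2(p))^2 \log_2 e}{2p(p-1)(\log_2((1-p)/p))^3}\Big)\log_2 p \\
&\qquad - \Big(1 - p - \frac{\varepsilon(1-H_2(p))}{\log_2((1-p)/p)} + \frac{\varepsilon^2(1-H_2(p))^2 \log_2 e}{2p(p-1)(\log_2((1-p)/p))^3}\Big)\log_2(1-p) \\
&= -\varepsilon(1-H_2(p)) + \frac{\varepsilon(1-H_2(p))\big(-\log_2 p + \log_2(1-p)\big)}{\log_2((1-p)/p)} + \frac{\varepsilon^2(1-H_2(p))^2 \log_2 e\,\big(\log_2 p - \log_2(1-p)\big)}{2p(p-1)(\log_2((1-p)/p))^3} \\
&= \frac{\varepsilon^2(1-H_2(p))^2 \log_2 e}{2p(1-p)(\log_2((1-p)/p))^2} = c_p\,\varepsilon^2 ,
\end{aligned}
\]

where $c_p > 0$ is a constant that depends only on the crossover
probability $p$ of the channel. Note that the transition (*) follows
from Lemma 7. □

Proposition 9: If the codes $\mathcal{C}_{BZ3}$ have a positive error exponent,
then $\Delta = \Omega(1/\varepsilon^2)$.

Proof. It is shown in [3] that the decoding error probability of the
code $\mathcal{C}_{BZ3}$, $\mathrm{Prob}_e(\mathcal{C}_{BZ3})$, satisfies

\[ \mathrm{Prob}_e(\mathcal{C}_{BZ3}) \le \exp\big\{-n\Delta\ell\delta_1 (1+\alpha)^{-1} \cdot \big(E_0(R_0,p) - M\alpha\big)(1 - o(1))\big\} , \]

where $\alpha$ is a constant defined in [3] (in particular, $1 > \alpha > 2\lambda_1/d_1$),
and

\[ M = M(R,p) = \begin{cases}
\frac{1}{2}\log_2((1-p)/p) & \text{if } R \le R_{crit} \\
\log_2\Big(\dfrac{\delta_{GV}(R)(1-p)}{(1-\delta_{GV}(R))\,p}\Big) & \text{if } R \ge R_{crit} ,
\end{cases} \]

$\delta_{GV}(R) = H_2^{-1}(1-R)$ is the Gilbert-Varshamov relative distance
for the rate $R$, and $R_{crit} = 1 - H_2(\rho_0)$ is a so-called critical rate,
where $\rho_0 = \sqrt{p}/(\sqrt{p} + \sqrt{1-p})$ (see [3] for details).

We are interested in small values of $\varepsilon$, i.e. $R > R_{crit}$. In this case,
the value of $M(R,p)$ can be rewritten as

\[
\begin{aligned}
M(R,p) &= \log_2\Big(\frac{\delta_{GV}(R)(1-p)}{(1-\delta_{GV}(R))\,p}\Big)
= \log_2\Big(\frac{H_2^{-1}(1-R)\,(1-p)}{(1-H_2^{-1}(1-R))\,p}\Big) \\
&= \log_2\Big(\frac{H_2^{-1}(H_2(p) + \varepsilon - \varepsilon H_2(p))\,(1-p)}{\big(1-H_2^{-1}(H_2(p) + \varepsilon - \varepsilon H_2(p))\big)\,p}\Big) , \qquad (22)
\end{aligned}
\]

where the last transition is due to $R = (1 - H_2(p))(1 - \varepsilon)$. Using
Lemma 7 and disregarding the $O(\varepsilon^3)$ term, the equality (22) becomes

\[ M(R,p) = \log_2\left(\frac{\Big(p + \dfrac{\varepsilon(1-H_2(p))}{\log_2((1-p)/p)} - \dfrac{\varepsilon^2(1-H_2(p))^2\log_2 e}{2p(p-1)(\log_2((1-p)/p))^3}\Big)(1-p)}{\Big(1 - p - \dfrac{\varepsilon(1-H_2(p))}{\log_2((1-p)/p)} + \dfrac{\varepsilon^2(1-H_2(p))^2\log_2 e}{2p(p-1)(\log_2((1-p)/p))^3}\Big)\,p}\right) . \]

When ignoring the terms of $\varepsilon^2$ and higher powers of $\varepsilon$, and denoting
$\theta = \varepsilon(1-H_2(p))/\log_2((1-p)/p)$, this equation becomes

\[
\begin{aligned}
M(R,p) &= \log_2\Big(\frac{(1-p)(p+\theta)}{p\,(1-p-\theta)}\Big) + O(\theta^2) \\
&= \log_2\big((1+\theta/p)(1+\theta/(1-p))\big) + O(\theta^2) \\
&= \log_2\big(1 + \theta/p + \theta/(1-p)\big) + O(\theta^2) .
\end{aligned}
\]

Using Taylor's series for $\ln(\cdot)$ around 1 we obtain

\[ M(R,p) = \log_2 e \cdot \theta\Big(\frac{1}{p} + \frac{1}{1-p}\Big) + O(\theta^2) = \frac{\log_2 e}{p(1-p)}\,\theta + O(\theta^2) , \]

and switching back to $\varepsilon$ notation this becomes

\[ M(R,p) = \frac{\log_2 e}{p(1-p)} \cdot \frac{\varepsilon(1-H_2(p))}{\log_2((1-p)/p)} + O(\varepsilon^2) = \Theta(\varepsilon) . \qquad (23) \]

Next, we evaluate the value of $\alpha$. Recall that $\alpha > 2\lambda_1/d_1$, and
$d_1 \le \Delta_1 \le \Delta$. We have

\[ \alpha > \frac{2\lambda_1}{d_1} \ge \frac{4\sqrt{\Delta_1 - 1}}{\Delta_1} \ge \frac{4\sqrt{\Delta - 1}}{\Delta} = \Omega\Big(\frac{1}{\sqrt{\Delta}}\Big) . \]

In order to have a positive error exponent it is necessary that

\[ E_0(R_0,p) - M\alpha > 0 \;\Longrightarrow\; \frac{E_0(R_0,p)}{M} > \alpha \;\Longrightarrow\; \frac{E_0(R_0,p)}{M} = \Omega\Big(\frac{1}{\sqrt{\Delta}}\Big) . \]

Using Proposition 8, $E_0(R_0,p) = \Theta(\varepsilon^2)$, and thus from (23)

\[ \varepsilon = \Omega(1/\sqrt{\Delta}) \;\Longrightarrow\; \Delta = \Omega(1/\varepsilon^2) . \]

□

Assuming that the first two decoding iterations are as suggested
in [3], we conclude that the time complexity of the decoding is

\[ \exp\{\Omega(\Delta)\} = \exp\{\Omega(1/\varepsilon^2)\} . \]

Appendix A

Proof of Lemma EJ

We analyze the error exponent, following the guidelines of the
analysis of Forney [5, Chapter 4.2]. Let $\zeta_i$, $i = 1, \ldots, n$, be a random
variable which equals $1$ if no inner decoding error is made while
decoding the $i$-th inner codeword, and $-1$ otherwise. The outer code
will fail to decode correctly if and only if

\[ \frac{1}{n}\sum_{i=1}^{n} \zeta_i \le 1 - 2\beta . \]

Denote

\[ \mu(-s) = \ln\big(\mathrm{Prob}_e(\mathcal{C}_{in}) \cdot e^{s} + (1 - \mathrm{Prob}_e(\mathcal{C}_{in})) \cdot e^{-s}\big) . \]

Using the Chernoff bound, we obtain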

\[ \mathrm{Prob}_e(\mathcal{C}) = \mathrm{Prob}\Big(\frac{1}{n}\sum_{i=1}^{n} \zeta_i \le 1 - 2\beta\Big) \le e^{-n\left(s(2\beta-1) - \mu(-s)\right)} . \]

Optimization of the exponent over values of $s$ yields that the
maximum of the expression

\[ s(2\beta - 1) - \mu(-s) \]

is achieved when

\[ s = \frac{1}{2}\ln\frac{(1 - \mathrm{Prob}_e(\mathcal{C}_{in})) \cdot 2\beta}{\mathrm{Prob}_e(\mathcal{C}_{in}) \cdot (2 - 2\beta)} , \]

and the maximum is

\[ s(2\beta - 1) - \mu(-s) = -\beta\ln(\mathrm{Prob}_e(\mathcal{C}_{in})) - (1-\beta)\ln(1 - \mathrm{Prob}_e(\mathcal{C}_{in})) + \beta\ln(\beta) + (1-\beta)\ln(1-\beta) , \]

thus completing the proof. □
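
The optimization step in this proof can be verified numerically. The Python sketch below evaluates the exponent $s(2\beta-1) - \mu(-s)$ at the stated optimal $s$ and compares it with the closed-form maximum; it also checks on a grid that no other $s$ gives a larger value. The function names and the values of the inner error probability `P` and of `beta` are hypothetical choices for this illustration.

```python
import math

def chernoff_exponent(s, P, beta):
    """s(2*beta - 1) - mu(-s), with mu(-s) = ln(P e^s + (1 - P) e^{-s})."""
    mu = math.log(P * math.exp(s) + (1 - P) * math.exp(-s))
    return s * (2 * beta - 1) - mu

def optimal_s(P, beta):
    """Maximizer s = (1/2) ln( (1 - P) * 2 beta / (P * (2 - 2 beta)) )."""
    return 0.5 * math.log((1 - P) * 2 * beta / (P * (2 - 2 * beta)))

def closed_form_maximum(P, beta):
    """-beta ln P - (1 - beta) ln(1 - P) + beta ln beta + (1 - beta) ln(1 - beta)."""
    return (-beta * math.log(P) - (1 - beta) * math.log(1 - P)
            + beta * math.log(beta) + (1 - beta) * math.log(1 - beta))

P, beta = 0.01, 0.2  # hypothetical inner error probability and threshold
s_star = optimal_s(P, beta)
print(s_star, chernoff_exponent(s_star, P, beta), closed_form_maximum(P, beta))
```

The closed-form maximum is the Kullback-Leibler divergence (in nats) between Bernoulli($\beta$) and Bernoulli($\mathrm{Prob}_e(\mathcal{C}_{in})$), so it is positive whenever $\beta$ differs from the inner error probability.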
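Appendix B

Proof of Lemma 7

Consider the value of the binary entropy function at the point $p+x$
for small $x > 0$. Using Taylor series around point $p$,

\[ H_2(p+x) = H_2(p) + H_2'(p) \cdot x + \tfrac{1}{2} H_2''(p) \cdot x^2 + O(x^3) . \]

By calculation of the derivatives of the entropy function, one obtains

\[ H_2'(x) = -\log_2 x - x \cdot \frac{1}{x}\log_2 e + \log_2(1-x) + (1-x) \cdot \frac{1}{1-x}\log_2 e = \log_2\frac{1-x}{x} , \]

and

\[ H_2''(x) = \log_2 e \cdot \Big(-\frac{1}{1-x} - \frac{1}{x}\Big) = \frac{\log_2 e}{x(x-1)} . \]

Therefore,

\[ H_2(p+x) = H_2(p) + \log_2\Big(\frac{1-p}{p}\Big) \cdot x + \frac{\log_2 e}{p(p-1)} \cdot \frac{x^2}{2} + O(x^3) . \]

By applying the inverse of the binary entropy function on both sides
of the equation,

\[ p + x = H_2^{-1}(H_2(p+x)) = H_2^{-1}\Big(H_2(p) + \log_2\Big(\frac{1-p}{p}\Big) \cdot x + \frac{\log_2 e}{p(p-1)} \cdot \frac{x^2}{2} + O(x^3)\Big) . \]

Denote by $\theta$ the value of $\log_2\big(\frac{1-p}{p}\big) \cdot x + \frac{\log_2 e}{p(p-1)} \cdot \frac{x^2}{2}$, thus obtaining

\[ p + x = H_2^{-1}\big(H_2(p) + \theta + O(x^3)\big) . \qquad (24) \]

By solving the quadratic equation

\[ \theta = \Big(\ln\Big(\frac{1-p}{p}\Big) \cdot x + \frac{1}{p(p-1)} \cdot \frac{x^2}{2}\Big) \cdot \log_2 e , \]

or equivalently

\[ x^2 + 2p(p-1)\ln\Big(\frac{1-p}{p}\Big) \cdot x - \frac{2\theta p(p-1)}{\log_2 e} = 0 , \]

we obtain two solutions for the intermediate $x$, namely

\[ x = -p(p-1)\ln\Big(\frac{1-p}{p}\Big) \pm \sqrt{\Big(p(p-1)\ln\Big(\frac{1-p}{p}\Big)\Big)^2 + \frac{2\theta p(p-1)}{\log_2 e}} ; \]

however, only one of these solutions tends to zero together with $\theta$, as is required for small $x$:

\[ x = -p(p-1)\ln\Big(\frac{1-p}{p}\Big) - \sqrt{\Big(p(p-1)\ln\Big(\frac{1-p}{p}\Big)\Big)^2 + \frac{2\theta p(p-1)}{\log_2 e}} . \]

Since $p(p-1)\ln\big(\frac{1-p}{p}\big) < 0$ for $p < \frac{1}{2}$, the latter equality can be rewritten as

\[ x = p(p-1)\ln\Big(\frac{1-p}{p}\Big) \cdot \Big(-1 + \sqrt{1 + \frac{2\theta}{p(p-1)(\ln((1-p)/p))^2 \log_2 e}}\Big) . \]

Using Taylor series approximation

\[ \sqrt{1+\chi} = 1 + \tfrac{1}{2}\chi - \tfrac{1}{8}\chi^2 + O(\chi^3) , \]

for small values of $\chi$, this becomes

\[
\begin{aligned}
x &= p(p-1)\ln\Big(\frac{1-p}{p}\Big) \cdot \Big(-1 + 1 + \frac{\theta}{p(p-1)(\ln((1-p)/p))^2 \log_2 e} - \frac{1}{2} \cdot \frac{\theta^2}{p^2(p-1)^2(\ln((1-p)/p))^4(\log_2 e)^2} + O(\theta^3)\Big) \\
&= \frac{\theta}{\log_2((1-p)/p)} - \frac{1}{2} \cdot \frac{\theta^2 \log_2 e}{p(p-1)(\log_2((1-p)/p))^3} + O(\theta^3) . \qquad (25)
\end{aligned}
\]

We substitute the evaluation of the value of $x$ in (25) into the equation (24). Thus, we obtain

\[ H_2^{-1}\big(H_2(p) + \theta + O(\theta^3)\big) = p + \frac{\theta}{\log_2((1-p)/p)} - \frac{1}{2} \cdot \frac{\theta^2 \log_2 e}{p(p-1)(\log_2((1-p)/p))^3} + O(\theta^3) . \qquad (26) \]

If $p < \frac{1}{2}$ is fixed and $\theta$ is small, then the value of $H_2(p) + \theta$ is
bounded away from 1. In this case, the derivative of $H_2^{-1}(x)$ at point
$x = H_2(p) + \theta$ is bounded, and, therefore

\[ H_2^{-1}\big(H_2(p) + \theta + O(\theta^3)\big) = H_2^{-1}\big(H_2(p) + \theta\big) + O(\theta^3) . \]

Then, the equality (26) becomes

\[ H_2^{-1}\big(H_2(p) + \theta\big) = p + \frac{\theta}{\log_2((1-p)/p)} - \frac{1}{2} \cdot \frac{\theta^2 \log_2 e}{p(p-1)(\log_2((1-p)/p))^3} + O(\theta^3) . \]

Finally, we substitute $\theta = \varepsilon(1 - H_2(p))$ and receive that

\[ H_2^{-1}\big(H_2(p) + \varepsilon(1-H_2(p))\big) = p + \frac{\varepsilon(1-H_2(p))}{\log_2((1-p)/p)} - \frac{1}{2} \cdot \frac{\varepsilon^2(1-H_2(p))^2 \log_2 e}{p(p-1)(\log_2((1-p)/p))^3} + O(\varepsilon^3) , \]

thus completing the proof of the lemma.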